A Machine Learning Approach to Linking FOAF Instances

نویسندگان

  • Jennifer Sleeman
  • Timothy W. Finin
چکیده

The friend of a friend (FOAF) vocabulary is widely used on the Web to describe individual people and their properties. Since FOAF does not require a unique ID for a person, it is not clear when two FOAF agents should be linked as coreferent, i.e., denote the same person in the world. One approach is to use the presence of inverse functional properties (e.g., foaf:mbox) as evidence that two individuals are the same. Another applies heuristics based on the string similarity of values of FOAF properties such as name and school as evidence for or against co-reference. Performance is limited, however, by many factors: non-semantic string matching, noise, changes in the world, and the lack of more sophisticated graph analytics. We describe a supervised machine learning approach that uses features defined over pairs of FOAF individuals to produce a classifier for identifying co-referent FOAF instances. We present initial results using data collected from Swoogle and other sources and describe plans for additional analysis.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning Co-reference Relations for FOAF Instances

FOAF is widely used on the Web to describe people, groups and organizations and their properties. Since FOAF does not require unique IDs, it is often unclear when two FOAF instances are co-referent, i.e., denote the same entity in the world. We describe a prototype system that identifies sets of co-referent FOAF instances using logical constraints (e.g., IFPs), strong heuristics (e.g., FOAF age...

متن کامل

Computing FOAF Co-reference Relations with Rules and Machine Learning⋆

The friend of a friend (FOAF) vocabulary is widely used on the Web to describe ’agents’ (people, groups and organizations) and their properties. Since FOAF does not require unique ID for agents, it is not clear when two FOAF instances should be linked as co-referent, i.e., denote the same entity in the world. One approach is to use logical constraints such as the presence of inverse functional ...

متن کامل

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

متن کامل

Scalable Relational Learning for Sparse and Incomplete Domains

The Semantic Web (SW) presents new challenges to statistical relational learning. One of the main features of SW data is that it is notoriously incomplete. Consider friend-of-a-friend (FOAF) data. The purpose of the FOAF project is to create a web of machinereadable pages describing people, their relationships, and people’s activities and interests, using SW technology. Obviously people vary in...

متن کامل

A Hybrid Machine Learning Method for Intrusion Detection

Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010